Using LSTMs with the subwords dataset

In this colab, you'll compare the results of a model that uses only an Embedding layer against models that add bidirectional LSTM layers.

You'll work with a subword-tokenized version of the combined Yelp and Amazon reviews dataset.

You'll use your models to predict the sentiment of new reviews.

In [1]:
import tensorflow as tf

from tensorflow.keras.preprocessing.sequence import pad_sequences

Get the dataset

Start by downloading the dataset of Amazon and Yelp reviews, along with their sentiment labels (1 for positive, 0 for negative). This dataset was originally extracted from here.

In [2]:
!wget --no-check-certificate \
    https://drive.google.com/uc?id=13ySLC_ue6Umt9RJYSeM2t-V0kCv-4C-P -O /tmp/sentiment.csv
--2020-08-09 02:22:28--  https://drive.google.com/uc?id=13ySLC_ue6Umt9RJYSeM2t-V0kCv-4C-P
Resolving drive.google.com (drive.google.com)... 74.125.195.102, 74.125.195.100, 74.125.195.138, ...
Connecting to drive.google.com (drive.google.com)|74.125.195.102|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://doc-08-ak-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/kian8r63dvdde9l0naoaupm9p4vh66d9/1596939675000/11118900490791463723/*/13ySLC_ue6Umt9RJYSeM2t-V0kCv-4C-P [following]
Warning: wildcards not supported in HTTP.
--2020-08-09 02:22:29--  https://doc-08-ak-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/kian8r63dvdde9l0naoaupm9p4vh66d9/1596939675000/11118900490791463723/*/13ySLC_ue6Umt9RJYSeM2t-V0kCv-4C-P
Resolving doc-08-ak-docs.googleusercontent.com (doc-08-ak-docs.googleusercontent.com)... 74.125.142.132, 2607:f8b0:400e:c08::84
Connecting to doc-08-ak-docs.googleusercontent.com (doc-08-ak-docs.googleusercontent.com)|74.125.142.132|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 127831 (125K) [text/csv]
Saving to: ‘/tmp/sentiment.csv’

/tmp/sentiment.csv  100%[===================>] 124.83K  --.-KB/s    in 0.001s  

2020-08-09 02:22:29 (118 MB/s) - ‘/tmp/sentiment.csv’ saved [127831/127831]

In [3]:
import pandas as pd

dataset = pd.read_csv('/tmp/sentiment.csv')

# Extract the sentences and labels
sentences = dataset['text'].tolist()
labels = dataset['sentiment'].tolist()
In [4]:
# Print some example sentences and labels
for x in range(2):
  print(sentences[x])
  print(labels[x])
  print("\n")
So there is no way for me to plug it in here in the US unless I go by a converter.
0


Good case Excellent value.
1


Create a subwords dataset

We will tokenize the Amazon and Yelp reviews dataset using the SubwordTextEncoder functionality from tensorflow_datasets.

SubwordTextEncoder.build_from_corpus() will create a tokenizer for us. You could also use this functionality to build subwords from a much larger corpus of text, but here we'll just use our existing dataset.

We'll limit the vocabulary (vocab_size) to the 1,000 most common subwords and cap each subword at a maximum of 5 characters.

Check out the related documentation for the subword text encoder here.

In [5]:
import tensorflow_datasets as tfds

vocab_size = 1000
tokenizer = tfds.features.text.SubwordTextEncoder.build_from_corpus(sentences, vocab_size, max_subword_length=5)

# How big is the vocabulary?
print("Vocab size is ", tokenizer.vocab_size)
Vocab size is  999
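
If you want to peek at what the encoder learned, it exposes a subwords attribute (part of the tfds SubwordTextEncoder API); for example:

# Inspect a few of the learned subwords (exact contents will vary by corpus)
print(tokenizer.subwords[:10])
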
In [6]:
# Check that the tokenizer works appropriately
num = 5
print(sentences[num])
encoded = tokenizer.encode(sentences[num])
print(encoded)
I have to jiggle the plug to get it to line up right to get decent volume.
[4, 31, 6, 849, 162, 450, 12, 1, 600, 438, 775, 6, 175, 14, 6, 55, 213, 159, 474, 775, 6, 175, 614, 380, 295, 148, 72, 789]
In [7]:
# Separately print out each subword, decoded
for i in encoded:
  print(tokenizer.decode([i]))
I 
have 
to 
j
ig
gl
e 
the 
pl
ug
 
to 
get 
it 
to 
li
ne 
up 
right
 
to 
get 
dec
ent 
vo
lu
me
.
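
As a quick sanity check, decoding the full list of ids with the same encoder should round-trip back to the original sentence:

# Decoding the whole encoded list reconstructs the original text
print(tokenizer.decode(encoded))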

Replace sentence data with encoded subwords

Now, we'll create the sequences to be used for training by encoding each of the individual sentences. This is equivalent to texts_to_sequences with the Tokenizer we used in earlier exercises.
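
For reference, here is a sketch of the word-level equivalent with the Keras Tokenizer from the earlier exercises (the variable names are illustrative, and this cell is not needed here):

# Word-level equivalent, shown only for comparison
from tensorflow.keras.preprocessing.text import Tokenizer

word_tokenizer = Tokenizer(num_words=vocab_size, oov_token="<OOV>")
word_tokenizer.fit_on_texts(sentences)                         # build the word index
word_sequences = word_tokenizer.texts_to_sequences(sentences)  # words -> integer ids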

In [8]:
for i, sentence in enumerate(sentences):
  sentences[i] = tokenizer.encode(sentence)
In [9]:
# Check the sentences are appropriately replaced
print(sentences[5])
[4, 31, 6, 849, 162, 450, 12, 1, 600, 438, 775, 6, 175, 14, 6, 55, 213, 159, 474, 775, 6, 175, 614, 380, 295, 148, 72, 789]

Final pre-processing

Before training, we still need to pad the sequences and split them into training and test sets.

In [10]:
import numpy as np

max_length = 50
trunc_type='post'
padding_type='post'

# Pad all sequences
sequences_padded = pad_sequences(sentences, maxlen=max_length, 
                                 padding=padding_type, truncating=trunc_type)

# Separate out the sentences and labels into training and test sets
training_size = int(len(sentences) * 0.8)

training_sequences = sequences_padded[0:training_size]
testing_sequences = sequences_padded[training_size:]
training_labels = labels[0:training_size]
testing_labels = labels[training_size:]

# Make labels into numpy arrays for use with the network later
training_labels_final = np.array(training_labels)
testing_labels_final = np.array(testing_labels)
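
As a quick check, you can confirm the shapes line up before building the model:

# The padded sequences are numpy arrays of shape (num_examples, max_length)
print(training_sequences.shape, testing_sequences.shape)
print(training_labels_final.shape, testing_labels_final.shape)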

Create the model using an Embedding

In [11]:
embedding_dim = 16

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.GlobalAveragePooling1D(), 
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, 50, 16)            16000     
_________________________________________________________________
global_average_pooling1d (Gl (None, 16)                0         
_________________________________________________________________
dense (Dense)                (None, 6)                 102       
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 7         
=================================================================
Total params: 16,109
Trainable params: 16,109
Non-trainable params: 0
_________________________________________________________________
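
As a sanity check on the summary above, the parameter counts follow directly from the layer sizes (the arithmetic below simply restates the summary):

# Embedding: one 16-dim vector per subword id
print(vocab_size * embedding_dim)   # 16,000
# GlobalAveragePooling1D has no trainable weights
# Dense(6) on the 16-dim pooled vector, plus biases
print(embedding_dim * 6 + 6)        # 102
# Final Dense(1), plus bias
print(6 * 1 + 1)                    # 7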

Train the model

In [12]:
num_epochs = 30
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
history = model.fit(training_sequences, training_labels_final, epochs=num_epochs, validation_data=(testing_sequences, testing_labels_final))
Epoch 1/30
50/50 [==============================] - 0s 6ms/step - loss: 0.6925 - accuracy: 0.5361 - val_loss: 0.6936 - val_accuracy: 0.4737
Epoch 2/30
50/50 [==============================] - 0s 4ms/step - loss: 0.6897 - accuracy: 0.5612 - val_loss: 0.6934 - val_accuracy: 0.4586
Epoch 3/30
50/50 [==============================] - 0s 4ms/step - loss: 0.6834 - accuracy: 0.5756 - val_loss: 0.6928 - val_accuracy: 0.4787
Epoch 4/30
50/50 [==============================] - 0s 3ms/step - loss: 0.6733 - accuracy: 0.6089 - val_loss: 0.6820 - val_accuracy: 0.5489
Epoch 5/30
50/50 [==============================] - 0s 3ms/step - loss: 0.6564 - accuracy: 0.6924 - val_loss: 0.6713 - val_accuracy: 0.5639
Epoch 6/30
50/50 [==============================] - 0s 3ms/step - loss: 0.6317 - accuracy: 0.7257 - val_loss: 0.6525 - val_accuracy: 0.6065
Epoch 7/30
50/50 [==============================] - 0s 3ms/step - loss: 0.5995 - accuracy: 0.7734 - val_loss: 0.6282 - val_accuracy: 0.6767
Epoch 8/30
50/50 [==============================] - 0s 3ms/step - loss: 0.5595 - accuracy: 0.8029 - val_loss: 0.6051 - val_accuracy: 0.6892
Epoch 9/30
50/50 [==============================] - 0s 3ms/step - loss: 0.5200 - accuracy: 0.8192 - val_loss: 0.5772 - val_accuracy: 0.7218
Epoch 10/30
50/50 [==============================] - 0s 3ms/step - loss: 0.4781 - accuracy: 0.8418 - val_loss: 0.5616 - val_accuracy: 0.7243
Epoch 11/30
50/50 [==============================] - 0s 3ms/step - loss: 0.4405 - accuracy: 0.8588 - val_loss: 0.5362 - val_accuracy: 0.7469
Epoch 12/30
50/50 [==============================] - 0s 3ms/step - loss: 0.4085 - accuracy: 0.8581 - val_loss: 0.5157 - val_accuracy: 0.7719
Epoch 13/30
50/50 [==============================] - 0s 4ms/step - loss: 0.3800 - accuracy: 0.8701 - val_loss: 0.5140 - val_accuracy: 0.7444
Epoch 14/30
50/50 [==============================] - 0s 3ms/step - loss: 0.3538 - accuracy: 0.8745 - val_loss: 0.5203 - val_accuracy: 0.7444
Epoch 15/30
50/50 [==============================] - 0s 4ms/step - loss: 0.3335 - accuracy: 0.8807 - val_loss: 0.5277 - val_accuracy: 0.7318
Epoch 16/30
50/50 [==============================] - 0s 3ms/step - loss: 0.3150 - accuracy: 0.8876 - val_loss: 0.4940 - val_accuracy: 0.7594
Epoch 17/30
50/50 [==============================] - 0s 3ms/step - loss: 0.2981 - accuracy: 0.8970 - val_loss: 0.4978 - val_accuracy: 0.7619
Epoch 18/30
50/50 [==============================] - 0s 3ms/step - loss: 0.2827 - accuracy: 0.9014 - val_loss: 0.5161 - val_accuracy: 0.7444
Epoch 19/30
50/50 [==============================] - 0s 3ms/step - loss: 0.2687 - accuracy: 0.9071 - val_loss: 0.5112 - val_accuracy: 0.7544
Epoch 20/30
50/50 [==============================] - 0s 4ms/step - loss: 0.2566 - accuracy: 0.9090 - val_loss: 0.5173 - val_accuracy: 0.7569
Epoch 21/30
50/50 [==============================] - 0s 3ms/step - loss: 0.2448 - accuracy: 0.9171 - val_loss: 0.5226 - val_accuracy: 0.7494
Epoch 22/30
50/50 [==============================] - 0s 3ms/step - loss: 0.2351 - accuracy: 0.9184 - val_loss: 0.5217 - val_accuracy: 0.7544
Epoch 23/30
50/50 [==============================] - 0s 4ms/step - loss: 0.2237 - accuracy: 0.9222 - val_loss: 0.5180 - val_accuracy: 0.7619
Epoch 24/30
50/50 [==============================] - 0s 3ms/step - loss: 0.2153 - accuracy: 0.9272 - val_loss: 0.5360 - val_accuracy: 0.7519
Epoch 25/30
50/50 [==============================] - 0s 4ms/step - loss: 0.2087 - accuracy: 0.9272 - val_loss: 0.5445 - val_accuracy: 0.7419
Epoch 26/30
50/50 [==============================] - 0s 3ms/step - loss: 0.2000 - accuracy: 0.9291 - val_loss: 0.5488 - val_accuracy: 0.7469
Epoch 27/30
50/50 [==============================] - 0s 3ms/step - loss: 0.1926 - accuracy: 0.9372 - val_loss: 0.5734 - val_accuracy: 0.7393
Epoch 28/30
50/50 [==============================] - 0s 3ms/step - loss: 0.1850 - accuracy: 0.9385 - val_loss: 0.5724 - val_accuracy: 0.7419
Epoch 29/30
50/50 [==============================] - 0s 3ms/step - loss: 0.1782 - accuracy: 0.9473 - val_loss: 0.5768 - val_accuracy: 0.7469
Epoch 30/30
50/50 [==============================] - 0s 3ms/step - loss: 0.1721 - accuracy: 0.9498 - val_loss: 0.5972 - val_accuracy: 0.7444

Plot the accuracy and loss

In [13]:
import matplotlib.pyplot as plt


def plot_graphs(history, string):
  plt.plot(history.history[string])
  plt.plot(history.history['val_'+string])
  plt.xlabel("Epochs")
  plt.ylabel(string)
  plt.legend([string, 'val_'+string])
  plt.show()
  
plot_graphs(history, "accuracy")
plot_graphs(history, "loss")

Define a function to predict the sentiment of reviews

We'll be creating several model variants and will use each one to predict the sentiment of some new reviews.

To save time, create a function that will take in a model and some new reviews, and print out the predicted sentiment of each review.

The closer the sentiment value is to 1, the more positive the review.
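
If you'd rather report a label than a raw score, a small helper like this (hypothetical, not used elsewhere in this notebook) can threshold the sigmoid output:

# Hypothetical helper: map a sigmoid score to a text label
def sentiment_label(score, threshold=0.5):
  return "positive" if score >= threshold else "negative"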

In [14]:
# Define a function to take a series of reviews
# and predict whether each one is a positive or negative review

# max_length = 50 # previously defined

def predict_review(model, new_sentences, maxlen=max_length, show_padded_sequence=True):
  # Keep the original sentences so that we can keep using them later
  # Create an array to hold the encoded sequences
  new_sequences = []

  # Convert the new reviews to sequences
  for review in new_sentences:
    new_sequences.append(tokenizer.encode(review))

  trunc_type='post'
  padding_type='post'

  # Pad all sequences for the new reviews
  new_reviews_padded = pad_sequences(new_sequences, maxlen=maxlen,
                                     padding=padding_type, truncating=trunc_type)

  classes = model.predict(new_reviews_padded)

  # The closer the class is to 1, the more positive the review is
  for x in range(len(new_sentences)):
    
    # We can see the padded sequence if desired
    # Print the sequence
    if (show_padded_sequence):
      print(new_reviews_padded[x])
    # Print the review as text
    print(new_sentences[x])
    # Print its predicted class
    print(classes[x])
    print("\n")
In [15]:
# Use the model to predict some reviews   
fake_reviews = ["I love this phone", 
                "Everything was cold",
                "Everything was hot exactly as I wanted", 
                "Everything was green", 
                "the host seated us immediately",
                "they gave us free chocolate cake", 
                "we couldn't hear each other talk because of the shouting in the kitchen"
              ]

predict_review(model, fake_reviews)
[  4 281  16  25   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
I love this phone
[0.9025642]


[812 227 864 100 775   9 525 843   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
Everything was cold
[0.10863601]


[812 227 864 100 775   9 109   8 333 731  24  61   4 171  59  77   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
Everything was hot exactly as I wanted
[0.33902964]


[812 227 864 100 775   9 157 359 853   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
Everything was green
[0.09309307]


[  1 109 228 540 237 635 241 423 340  89 298   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
the host seated us immediately
[0.4488544]


[154 242  47 635 341  12 569 547 147 319 775 125  85   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
they gave us free chocolate cake
[0.73490804]


[158 190 853 782   8 607 775 210 232 146 775 470  67 305 101  15   1 328
 296  26  19   1 661 641 195   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
we couldn't hear each other talk because of the shouting in the kitchen
[0.01328599]


Define a function to train and show the results of models with different layers

In the rest of this colab, we will define several models and then compare their results.

Define a function that takes a model, compiles it, trains it, graphs the accuracy and loss, and then predicts some results.

In [16]:
def fit_model_now(model, sentences):
  # Note: training always uses the prepared training/testing sequences;
  # the sentences argument is only passed along for predictions later.
  model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
  model.summary()
  history = model.fit(training_sequences, training_labels_final, epochs=num_epochs, 
                      validation_data=(testing_sequences, testing_labels_final))
  return history

def plot_results(history):
  plot_graphs(history, "accuracy")
  plot_graphs(history, "loss")

def fit_model_and_show_results(model, sentences):
  history = fit_model_now(model, sentences)
  plot_results(history)
  predict_review(model, sentences)

Add a bidirectional LSTM

Create a new model that uses a bidirectional LSTM.

Then use the function we have already defined to compile the model, train it, graph the accuracy and loss, then predict some results.
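
As an aside, the 4,224 LSTM parameters you'll see in the summary below can be reproduced with the standard Keras LSTM formula (four gates per direction, doubled for the Bidirectional wrapper); this is only a sanity check:

# Per-direction LSTM params: 4 * (units * (units + input_dim) + units)
units, input_dim = embedding_dim, embedding_dim   # both are 16 here
per_direction = 4 * (units * (units + input_dim) + units)
print(2 * per_direction)                          # 4,224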

In [17]:
# Define the model
model_bidi_lstm = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(embedding_dim)), 
    tf.keras.layers.Dense(6, activation='relu'), 
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Compile and train the model and then show the predictions for our extra sentences
fit_model_and_show_results(model_bidi_lstm, fake_reviews)
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 50, 16)            16000     
_________________________________________________________________
bidirectional (Bidirectional (None, 32)                4224      
_________________________________________________________________
dense_2 (Dense)              (None, 6)                 198       
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 7         
=================================================================
Total params: 20,429
Trainable params: 20,429
Non-trainable params: 0
_________________________________________________________________
Epoch 1/30
50/50 [==============================] - 1s 22ms/step - loss: 0.6919 - accuracy: 0.5154 - val_loss: 0.6990 - val_accuracy: 0.4110
Epoch 2/30
50/50 [==============================] - 1s 10ms/step - loss: 0.6751 - accuracy: 0.5512 - val_loss: 0.6687 - val_accuracy: 0.6316
Epoch 3/30
50/50 [==============================] - 0s 10ms/step - loss: 0.5267 - accuracy: 0.7784 - val_loss: 0.5080 - val_accuracy: 0.7669
Epoch 4/30
50/50 [==============================] - 0s 9ms/step - loss: 0.3710 - accuracy: 0.8519 - val_loss: 0.4966 - val_accuracy: 0.7544
Epoch 5/30
50/50 [==============================] - 0s 9ms/step - loss: 0.2943 - accuracy: 0.8901 - val_loss: 0.5170 - val_accuracy: 0.7619
Epoch 6/30
50/50 [==============================] - 0s 9ms/step - loss: 0.2324 - accuracy: 0.9127 - val_loss: 0.6464 - val_accuracy: 0.7419
Epoch 7/30
50/50 [==============================] - 1s 11ms/step - loss: 0.1893 - accuracy: 0.9391 - val_loss: 0.6523 - val_accuracy: 0.7569
Epoch 8/30
50/50 [==============================] - 1s 10ms/step - loss: 0.1586 - accuracy: 0.9466 - val_loss: 0.6563 - val_accuracy: 0.7544
Epoch 9/30
50/50 [==============================] - 1s 11ms/step - loss: 0.1513 - accuracy: 0.9479 - val_loss: 0.9043 - val_accuracy: 0.7494
Epoch 10/30
50/50 [==============================] - 0s 9ms/step - loss: 0.1325 - accuracy: 0.9586 - val_loss: 0.9822 - val_accuracy: 0.7519
Epoch 11/30
50/50 [==============================] - 0s 9ms/step - loss: 0.1215 - accuracy: 0.9636 - val_loss: 0.8725 - val_accuracy: 0.7494
Epoch 12/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0982 - accuracy: 0.9711 - val_loss: 0.9002 - val_accuracy: 0.7469
Epoch 13/30
50/50 [==============================] - 1s 10ms/step - loss: 0.0786 - accuracy: 0.9824 - val_loss: 1.0444 - val_accuracy: 0.7419
Epoch 14/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0833 - accuracy: 0.9812 - val_loss: 1.0630 - val_accuracy: 0.7594
Epoch 15/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0886 - accuracy: 0.9730 - val_loss: 1.2106 - val_accuracy: 0.7368
Epoch 16/30
50/50 [==============================] - 0s 10ms/step - loss: 0.0750 - accuracy: 0.9799 - val_loss: 1.1668 - val_accuracy: 0.7569
Epoch 17/30
50/50 [==============================] - 1s 10ms/step - loss: 0.0551 - accuracy: 0.9912 - val_loss: 1.0425 - val_accuracy: 0.7669
Epoch 18/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0567 - accuracy: 0.9900 - val_loss: 1.0160 - val_accuracy: 0.7444
Epoch 19/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0482 - accuracy: 0.9925 - val_loss: 1.4153 - val_accuracy: 0.7343
Epoch 20/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0412 - accuracy: 0.9931 - val_loss: 1.2101 - val_accuracy: 0.7469
Epoch 21/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0375 - accuracy: 0.9944 - val_loss: 1.2740 - val_accuracy: 0.7444
Epoch 22/30
50/50 [==============================] - 0s 10ms/step - loss: 0.0332 - accuracy: 0.9950 - val_loss: 1.3241 - val_accuracy: 0.7444
Epoch 23/30
50/50 [==============================] - 1s 10ms/step - loss: 0.0362 - accuracy: 0.9931 - val_loss: 1.4030 - val_accuracy: 0.7469
Epoch 24/30
50/50 [==============================] - 1s 10ms/step - loss: 0.0810 - accuracy: 0.9761 - val_loss: 1.5282 - val_accuracy: 0.7293
Epoch 25/30
50/50 [==============================] - 1s 10ms/step - loss: 0.0729 - accuracy: 0.9805 - val_loss: 1.5500 - val_accuracy: 0.7193
Epoch 26/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0625 - accuracy: 0.9837 - val_loss: 1.6487 - val_accuracy: 0.7143
Epoch 27/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0570 - accuracy: 0.9868 - val_loss: 1.6483 - val_accuracy: 0.7243
Epoch 28/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0419 - accuracy: 0.9912 - val_loss: 1.4770 - val_accuracy: 0.7469
Epoch 29/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0316 - accuracy: 0.9956 - val_loss: 1.5001 - val_accuracy: 0.7469
Epoch 30/30
50/50 [==============================] - 0s 9ms/step - loss: 0.0296 - accuracy: 0.9956 - val_loss: 1.5527 - val_accuracy: 0.7444
[  4 281  16  25   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
I love this phone
[0.99872476]


[812 227 864 100 775   9 525 843   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
Everything was cold
[0.01407116]


[812 227 864 100 775   9 109   8 333 731  24  61   4 171  59  77   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
Everything was hot exactly as I wanted
[0.20281683]


[812 227 864 100 775   9 157 359 853   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
Everything was green
[0.02202881]


[  1 109 228 540 237 635 241 423 340  89 298   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
the host seated us immediately
[0.9954371]


[154 242  47 635 341  12 569 547 147 319 775 125  85   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
they gave us free chocolate cake
[0.9972101]


[158 190 853 782   8 607 775 210 232 146 775 470  67 305 101  15   1 328
 296  26  19   1 661 641 195   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
we couldn't hear each other talk because of the shouting in the kitchen
[0.01071847]


Use multiple bidirectional layers

Now let's see if we get any improvements from adding another Bidirectional LSTM layer to the model.

Notice that the first Bidirectional LSTM layer returns its full sequence of outputs (return_sequences=True), so the second LSTM layer has a sequence to process.
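
Returning sequences also explains the second layer's 6,272 parameters in the summary below: it sees a 32-dimensional input (the concatenated forward and backward outputs of the first layer). Again, just a sanity check using the same formula:

# Second LSTM layer: 16 units, 32-dim input from the first bidirectional layer
units, input_dim = embedding_dim, 2 * embedding_dim
per_direction = 4 * (units * (units + input_dim) + units)
print(2 * per_direction)                          # 6,272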

In [18]:
model_multiple_bidi_lstm = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(embedding_dim, 
                                                       return_sequences=True)), 
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(embedding_dim)),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

fit_model_and_show_results(model_multiple_bidi_lstm, fake_reviews)
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_2 (Embedding)      (None, 50, 16)            16000     
_________________________________________________________________
bidirectional_1 (Bidirection (None, 50, 32)            4224      
_________________________________________________________________
bidirectional_2 (Bidirection (None, 32)                6272      
_________________________________________________________________
dense_4 (Dense)              (None, 6)                 198       
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 7         
=================================================================
Total params: 26,701
Trainable params: 26,701
Non-trainable params: 0
_________________________________________________________________
Epoch 1/30
50/50 [==============================] - 2s 38ms/step - loss: 0.6914 - accuracy: 0.5223 - val_loss: 0.7030 - val_accuracy: 0.4110
Epoch 2/30
50/50 [==============================] - 1s 14ms/step - loss: 0.6588 - accuracy: 0.5800 - val_loss: 0.6652 - val_accuracy: 0.5639
Epoch 3/30
50/50 [==============================] - 1s 17ms/step - loss: 0.4784 - accuracy: 0.8123 - val_loss: 0.5752 - val_accuracy: 0.7368
Epoch 4/30
50/50 [==============================] - 1s 17ms/step - loss: 0.3339 - accuracy: 0.8738 - val_loss: 0.6423 - val_accuracy: 0.7569
Epoch 5/30
50/50 [==============================] - 1s 16ms/step - loss: 0.2613 - accuracy: 0.8933 - val_loss: 0.6480 - val_accuracy: 0.7343
Epoch 6/30
50/50 [==============================] - 1s 15ms/step - loss: 0.1982 - accuracy: 0.9291 - val_loss: 0.6755 - val_accuracy: 0.7519
Epoch 7/30
50/50 [==============================] - 1s 14ms/step - loss: 0.1700 - accuracy: 0.9410 - val_loss: 0.7369 - val_accuracy: 0.7569
Epoch 8/30
50/50 [==============================] - 1s 14ms/step - loss: 0.1342 - accuracy: 0.9605 - val_loss: 0.7639 - val_accuracy: 0.7794
Epoch 9/30
50/50 [==============================] - 1s 15ms/step - loss: 0.1097 - accuracy: 0.9705 - val_loss: 0.9304 - val_accuracy: 0.7494
Epoch 10/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0999 - accuracy: 0.9743 - val_loss: 0.9490 - val_accuracy: 0.7544
Epoch 11/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0826 - accuracy: 0.9805 - val_loss: 1.0866 - val_accuracy: 0.7368
Epoch 12/30
50/50 [==============================] - 1s 15ms/step - loss: 0.0727 - accuracy: 0.9831 - val_loss: 1.0949 - val_accuracy: 0.7519
Epoch 13/30
50/50 [==============================] - 1s 16ms/step - loss: 0.0722 - accuracy: 0.9831 - val_loss: 1.3035 - val_accuracy: 0.7318
Epoch 14/30
50/50 [==============================] - 1s 14ms/step - loss: 0.1060 - accuracy: 0.9699 - val_loss: 0.8344 - val_accuracy: 0.7744
Epoch 15/30
50/50 [==============================] - 1s 14ms/step - loss: 0.1199 - accuracy: 0.9648 - val_loss: 0.8988 - val_accuracy: 0.7619
Epoch 16/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0821 - accuracy: 0.9812 - val_loss: 1.0178 - val_accuracy: 0.7569
Epoch 17/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0777 - accuracy: 0.9812 - val_loss: 1.0891 - val_accuracy: 0.7669
Epoch 18/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0833 - accuracy: 0.9774 - val_loss: 0.9208 - val_accuracy: 0.7569
Epoch 19/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0641 - accuracy: 0.9862 - val_loss: 1.1231 - val_accuracy: 0.7393
Epoch 20/30
50/50 [==============================] - 1s 15ms/step - loss: 0.0571 - accuracy: 0.9881 - val_loss: 1.1479 - val_accuracy: 0.7594
Epoch 21/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0550 - accuracy: 0.9887 - val_loss: 1.1901 - val_accuracy: 0.7594
Epoch 22/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0544 - accuracy: 0.9887 - val_loss: 1.2309 - val_accuracy: 0.7594
Epoch 23/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0544 - accuracy: 0.9887 - val_loss: 1.2504 - val_accuracy: 0.7619
Epoch 24/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0542 - accuracy: 0.9887 - val_loss: 1.2719 - val_accuracy: 0.7619
Epoch 25/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0540 - accuracy: 0.9887 - val_loss: 1.2795 - val_accuracy: 0.7644
Epoch 26/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0541 - accuracy: 0.9887 - val_loss: 1.2968 - val_accuracy: 0.7644
Epoch 27/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0538 - accuracy: 0.9887 - val_loss: 1.3081 - val_accuracy: 0.7644
Epoch 28/30
50/50 [==============================] - 1s 14ms/step - loss: 0.0537 - accuracy: 0.9887 - val_loss: 1.3254 - val_accuracy: 0.7669
Epoch 29/30
50/50 [==============================] - 1s 15ms/step - loss: 0.0536 - accuracy: 0.9887 - val_loss: 1.3361 - val_accuracy: 0.7619
Epoch 30/30
50/50 [==============================] - 1s 16ms/step - loss: 0.0537 - accuracy: 0.9887 - val_loss: 1.3501 - val_accuracy: 0.7619
[  4 281  16  25   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
I love this phone
[0.9999465]


[812 227 864 100 775   9 525 843   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
Everything was cold
[0.02729134]


[812 227 864 100 775   9 109   8 333 731  24  61   4 171  59  77   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
Everything was hot exactly as I wanted
[0.9990119]


[812 227 864 100 775   9 157 359 853   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
Everything was green
[0.02564479]


[  1 109 228 540 237 635 241 423 340  89 298   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
the host seated us immediately
[0.02473117]


[154 242  47 635 341  12 569 547 147 319 775 125  85   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
they gave us free chocolate cake
[0.99992454]


[158 190 853 782   8 607 775 210 232 146 775 470  67 305 101  15   1 328
 296  26  19   1 661 641 195   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
we couldn't hear each other talk because of the shouting in the kitchen
[0.02545766]


Compare predictions for all the models

It can be hard to see which model gives a better prediction for different reviews when you examine each model separately. So for comparison purposes, here we define some more reviews and print out the predictions that each of the three models gives for each review:

  • Embeddings and a Global Average Pooling layer
  • Embeddings and a Bidirectional LSTM layer
  • Embeddings and two Bidirectional LSTM layers

The results are not always what you might expect. The input dataset is fairly small, with fewer than 2,000 reviews. Some of the reviews are quite short, and some of the short ones are fairly repetitive, which reduces how much they help improve the model. For example, consider these two reviews:

  • Bad Quality.
  • Low Quality.

Feel free to add more reviews of your own, or change the reviews. The results will depend on the combination of words in the reviews, and how well they match reviews in the training set.

How do the different models handle things like "wasn't good", which contains a positive word (good) but is actually a negative review?
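
One compact way to probe that (a sketch reusing the helper and models defined above) is to run a single tricky phrase through all three models side by side:

# Probe all three models on one tricky review
tricky = ["the food wasn't good"]
for name, m in [("pooling only", model),
                ("one bidirectional LSTM", model_bidi_lstm),
                ("two bidirectional LSTMs", model_multiple_bidi_lstm)]:
  print(name)
  predict_review(m, tricky, show_padded_sequence=False)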

In [19]:
my_reviews =["lovely", "dreadful", "stay away",
             "everything was hot exactly as I wanted",
             "everything was not exactly as I wanted",
             "they gave us free chocolate cake",
             "I've never eaten anything so spicy in my life, my throat burned for hours",
             "for a phone that is as expensive as this one I expect it to be much easier to use than this thing is",
             "we left there very full for a low price so I'd say you just can't go wrong at this place",
             "that place does not have quality meals and it isn't a good place to go for dinner",
             ]
In [20]:
print("===================================\n","Embeddings only:\n", "===================================",)
predict_review(model, my_reviews, show_padded_sequence=False)
===================================
 Embeddings only:
 ===================================
lovely
[0.92708254]


dreadful
[0.32230794]


stay away
[0.6518728]


everything was hot exactly as I wanted
[0.7750879]


everything was not exactly as I wanted
[0.55371344]


they gave us free chocolate cake
[0.73490804]


I've never eaten anything so spicy in my life, my throat burned for hours
[0.01812866]


for a phone that is as expensive as this one I expect it to be much easier to use than this thing is
[0.6021914]


we left there very full for a low price so I'd say you just can't go wrong at this place
[0.79115725]


that place does not have quality meals and it isn't a good place to go for dinner
[0.88204527]


In [21]:
print("===================================\n", "With a single bidirectional LSTM:\n", "===================================")
predict_review(model_bidi_lstm, my_reviews, show_padded_sequence=False)
===================================
 With a single bidirectional LSTM:
 ===================================
lovely
[0.9992417]


dreadful
[0.00907197]


stay away
[0.0233862]


everything was hot exactly as I wanted
[0.6811935]


everything was not exactly as I wanted
[0.01769193]


they gave us free chocolate cake
[0.9972101]


I've never eaten anything so spicy in my life, my throat burned for hours
[0.01404104]


for a phone that is as expensive as this one I expect it to be much easier to use than this thing is
[0.01039576]


we left there very full for a low price so I'd say you just can't go wrong at this place
[0.9999728]


that place does not have quality meals and it isn't a good place to go for dinner
[0.99948883]


In [22]:
print("===================================\n","With two bidirectional LSTMs:\n", "===================================")
predict_review(model_multiple_bidi_lstm, my_reviews, show_padded_sequence=False)
===================================
 With two bidirectional LSTMs:
 ===================================
lovely
[0.99993587]


dreadful
[0.0210799]


stay away
[0.33392268]


everything was hot exactly as I wanted
[0.999716]


everything was not exactly as I wanted
[0.04492097]


they gave us free chocolate cake
[0.99992454]


I've never eaten anything so spicy in my life, my throat burned for hours
[0.0302036]


for a phone that is as expensive as this one I expect it to be much easier to use than this thing is
[0.02199314]


we left there very full for a low price so I'd say you just can't go wrong at this place
[0.9997241]


that place does not have quality meals and it isn't a good place to go for dinner
[0.02401786]

